Incorporating Statistical Information of Lexical Dependency into a Rule-Based Parser

Authors

Yoon-Hyung Roh

Ki-Young Lee

Young-Gil Kim

Abstract

This paper presents a method for incorporating statistical information into a rule-based parser to resolve syntactic ambiguities. We extract the statistical information from the Penn Treebank and apply it to the rule-based parser. Extracting this information requires tag conversion, because the tags and the bracketing style of the treebank differ from those of the parser. We show the effect of the tag conversion experimentally. The final result shows an error rate reduction of about 7% in the dependency evaluation. We also show how much each type of statistical information affects parsing performance.
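The general idea in the abstract, using head-dependent statistics from a treebank to choose among analyses proposed by a rule-based parser, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy word pairs stand in for Penn Treebank data, the add-alpha smoothed estimate of P(head | dependent) is a generic choice rather than the scoring used in the paper, and the tag conversion step is not modeled at all. The function names `collect_counts`, `score_parse`, and `pick_best` are hypothetical.

```python
import math
from collections import Counter

# Toy "treebank": each sentence is a list of (head, dependent) word pairs.
# Invented for illustration only; this is not Penn Treebank data.
TOY_TREEBANK = [
    [("saw", "I"), ("saw", "man"), ("man", "the"),
     ("saw", "with"), ("with", "telescope")],
    [("saw", "she"), ("saw", "bird"), ("saw", "with"),
     ("with", "binoculars")],
    [("met", "man"), ("man", "with"), ("with", "hat")],
]


def collect_counts(treebank):
    """Count head-dependent pairs, dependents, and the set of distinct heads."""
    pair_counts, dep_counts, heads = Counter(), Counter(), set()
    for sentence in treebank:
        for head, dep in sentence:
            pair_counts[(head, dep)] += 1
            dep_counts[dep] += 1
            heads.add(head)
    return pair_counts, dep_counts, heads


def score_parse(parse, pair_counts, dep_counts, n_heads, alpha=1.0):
    """Sum of add-alpha smoothed log P(head | dependent) over all arcs."""
    total = 0.0
    for head, dep in parse:
        numerator = pair_counts[(head, dep)] + alpha
        denominator = dep_counts[dep] + alpha * n_heads
        total += math.log(numerator / denominator)
    return total


def pick_best(candidates, pair_counts, dep_counts, n_heads):
    """Return the candidate parse with the highest statistical score."""
    return max(
        candidates,
        key=lambda p: score_parse(p, pair_counts, dep_counts, n_heads),
    )


# Two analyses a rule-based parser might propose for
# "I saw the man with the telescope" (PP attached to the verb or the noun).
verb_attach = [("saw", "I"), ("saw", "man"), ("man", "the"),
               ("saw", "with"), ("with", "telescope")]
noun_attach = [("saw", "I"), ("saw", "man"), ("man", "the"),
               ("man", "with"), ("with", "telescope")]

pair_counts, dep_counts, heads = collect_counts(TOY_TREEBANK)
best = pick_best([verb_attach, noun_attach], pair_counts, dep_counts, len(heads))
```

In this toy data the pair ("saw", "with") occurs more often than ("man", "with"), so the verb-attachment analysis scores higher. The rule-based parser still decides which analyses are grammatical; the statistics only rank them.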

Related resources

A Rule-Based Approach to Automatically Converting Dependency Parse Trees into Phrase-Structure Parse Trees for Persian

In this paper, an automatic method for converting a dependency parse tree into an equivalent phrase-structure tree is introduced for the Persian language. In the first step, a rule-based algorithm was designed. Then, Persian-specific dependency-to-phrase-structure conversion rules were merged into the algorithm. Subsequently, the Persian dependency treebank with about 30,000 sentences was used as an input ...

New Parsing Method Using Global Association Table

This paper presents a new parsing method using statistical information extracted from a corpus, especially for Korean. Structural ambiguities occur in deciding the dependency relations between words in Korean. In figuring out the correct dependency, lexical associations play an important role in resolving the ambiguities. Our parser uses statistical co-occurrence data to compute the...

Feature Engineering in Persian Dependency Parser

A dependency parser is one of the most important fundamental tools in natural language processing; it extracts the structure of sentences and determines the relations between words based on dependency grammar. Dependency parsing is well suited to free-word-order languages such as Persian. In this paper, a data-driven dependency parser has been developed with the help of a phrase-structure parser fo...

Can Subcategorization Help a Statistical Dependency Parser?

Today there is a relatively large body of work on the automatic acquisition of lexico-syntactic preferences (subcategorization) from corpora. Various techniques have been developed that not only produce machine-readable subcategorization dictionaries but are also capable of weighting the various subcategorization frames probabilistically. Clearly there should be a potential to use such weighted...

Improving the Usability of Statistical Parsers by Incorporating Linguistic Constraints

Statistical systems with high accuracy are very useful in real-world applications. If these systems can capture basic linguistic information, their usefulness improves considerably. This paper is an attempt at incorporating linguistic constraints into statistical dependency parsing. We consider a simple linguistic constraint that a verb should not have multiple subjects/ob...

Publication date: 2009
